Recursive deletion of members on deletion of container #132

Open
fabiancook opened this issue Jun 9, 2019 · 18 comments
@fabiancook

Opening this issue to start discussion on moving forward with recursive deletion of members within a container. Related to solid/solid-spec#172

I propose that we allow this at the specification level; it would allow agents to remove a container without first reading and then deleting all of its members.

An example of when this would be useful: an agent no longer wants to use an application that had previously been given write access to a specific folder within the agent's storage pod. The user may not have rights to all directories within that container, but it is still their storage pod to do with as they please. If we allow recursive deletion, the agent doesn't need read or write access to those member resources, but can still delete the entire folder.
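
To make the current client-side workaround concrete, here is a minimal sketch (not from the thread; the helper and the naive Turtle handling are illustrative assumptions) of the read-then-delete loop an agent has to run today, which fails as soon as the agent lacks read access to a member:

```typescript
// Sketch of client-side recursive deletion as it works today: read the
// container's containment listing, delete each member (recursing into
// sub-containers), then delete the now-empty container. Assumes `fetch`
// is already authenticated for the agent; the regex extraction of
// ldp:contains is a deliberate simplification, not a real Turtle parser.
async function deleteRecursively(containerUrl: string): Promise<void> {
  const listing = await fetch(containerUrl, { headers: { Accept: "text/turtle" } });
  if (!listing.ok) {
    // This is exactly the failure mode discussed above: no read access
    // to the member listing means the client cannot empty the container.
    throw new Error(`Cannot read ${containerUrl}: ${listing.status}`);
  }
  const turtle = await listing.text();

  // Naively collect the objects of ldp:contains triples.
  const members = [...turtle.matchAll(/ldp:contains\s+<([^>]+)>/g)]
    .map((m) => new URL(m[1], containerUrl).href);

  for (const member of members) {
    if (member.endsWith("/")) {
      await deleteRecursively(member); // sub-container: recurse first
    } else {
      await fetch(member, { method: "DELETE" });
    }
  }

  // The container should now be empty, so a plain DELETE succeeds.
  await fetch(containerUrl, { method: "DELETE" });
}
```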

@fabiancook
Author

fabiancook commented Jun 9, 2019

There are two parts to this conversation:

  • The technical side, where it may not be best to delete all contained resources, for performance reasons among others
  • The WAC side, where it may not be correct to allow an agent to delete a container if its members cannot be deleted by that agent

The technical limitation is what solid/solid-spec#172 is leaning on, with a mention of WAC as well.

Maybe the question should be:

If an agent "owns" a container, can they delete all data contained within that container, given its theirs

I have a feeling it leads to issues where pod providers charge users a premium for storage, a resource becomes locked by an external agent (like an app storing secrets related to the user in the agent's storage), and the pod provider can't just "unlock" or delete that file for them, so the user is stuck with a perpetual bill for a locked file.

@akuckartz

If we allow recursive deletion, the agent doesn't need read or write access to those member resources, but can still delete the entire folder.

That sounds very problematic. Therefore 👎

@fabiancook
Author

That's the reason I created this issue: to discuss possible issues and alternatives.

It would be great to know the areas in which you think it would be problematic, and other suggestions as to how it should work.

@acoburn
Member

acoburn commented Jun 9, 2019

Another significant issue with recursive deletes is how to deal with partial failure. Does recursive delete imply some sort of atomic and/or transactional processing on the backend? If so, that would be problematic for implementing Solid at scale.

@bourgeoa
Member

I think this relates to the need for a pod owner to be able to back up part or all of his pod.
I must be able to own my backup.
That is not possible if I, as owner, have no access to some of the resources.

@RubenVerborgh
Contributor

the need for a pod owner to be able to back up part or all of his pod

Important remark: from a technical perspective, there is no such concept as a “pod owner”. A pod is a collection of resources with access control, and some agents have certain permissions to certain resources. There is technically speaking no special role for the person that we would colloquially indicate as an owner, since that person can easily give other people Control permissions to resources and then be denied access.

@bourgeoa
Member

I thought that the pod owner was the pod WebID creator. And he always exists, even if he is denied access to some resources. He could never be denied access to root pod resources.

@RubenVerborgh
Contributor

I thought that the pod owner was the pod WebID creator.

In practice, in some deployments, yes, but the rights deriving from that are up to the provider, and not part of the Solid technical spec.

But note that “pod WebID creator” in general does not exist.
Take my example, for instance:

  • My WebID is https://ruben.verborgh.org/profile/#me
  • One of my pods is https://drive.verborgh.org/
  • I happen to have Control permissions to the root folder on that pod
  • But if anyone else were to be given those permissions, then there is no distinction between them and me from a Solid spec perspective.
  • However, since I also happen to control the server at drive.verborgh.org, I do have additional powers. But, importantly, those powers do not come from the Solid spec.

And he always exists

So not necessarily, from a Solid spec perspective.

It is perfectly possible to create a pod with folders A and B, where only X has Control access to A and only Y has Control access to B. Neither X nor Y can be called an owner, not technically (because the concept doesn't exist at that level), nor otherwise (because neither of them has access to everything).

He could never be denied access to root pod resources.

Anyone can be denied access.

@bourgeoa
Member

I understand your point except on what I call root pod resources. That may be very bad wording. It is the contact point between the Solid spec and reality.

It may not be a spec problem, but it is a technical implementation one.

I cannot trust my pod if I cannot delete or back it up. There must be something between the pod provider and whoever controls pod resources, or else it seems that we are avoiding the pod concept, which needs to have certain properties from a conceptual or practical point of view.

@RubenVerborgh
Contributor

I cannot trust my pod if I cannot delete or back it up. There must be something between the pod provider and whoever controls pod resources

I understand and agree with this; but that agreement is outside of the Solid spec.

@bourgeoa
Member

Then I may also agree that it has to do with the implementation of node-solid-server, and I see nothing in the current project for that.

Where could there be a discussion and spec on that aspect, which may or may not have practical implications for the Solid spec implementation in node-solid-server? Owner properties management (change of mail recovery address, ..., and pod management).

@RubenVerborgh
Contributor

There probably needs to be a separate specification for user management.

@michielbdejong
Contributor

Thanks for writing this up @fabiancook, as you know my vote on this is to leave deletion as non-recursive.

@kjetilk
Member

kjetilk commented Jun 17, 2019

the need for a pod owner to be able to back up part or all of his pod

Important remark: from a technical perspective, there is no such concept as a “pod owner”. A pod is a collection of resources with access control, and some agents have certain permissions to certain resources. There is technically speaking no special role for the person that we would colloquially indicate as an owner, since that person can easily give other people Control permissions to resources and then be denied access.

Sliding OT here, but I just wrote up solid/user-stories#42 because clearly, that user can then give up control, leading to loss of control. Files can be corrupted, and so on. I remember starting a list of such loss-of-control situations some time ago, but I seem to have lost it.

One possible path would be to introduce the owner concept.

@RubenVerborgh
Contributor

We also need to consider security; see https://github.com/solid/solid-spec/issues/196

@kjetilk kjetilk transferred this issue from solid/solid-spec Dec 5, 2019
@kjetilk
Member

kjetilk commented Dec 5, 2019

As part of #41, we should continue the discussion here.

The way that I see this, there are several analogous functions we need to consider (a rough sketch of options 2 and 3 follows at the end of this comment):

  1. DELETE on container deletes all the resources in the container, but does not recurse, i.e. rm container/* ; rmdir container/
  2. DELETE on container recurses into the container hierarchy and deletes resources if the agent has acl:Write on the resources in the container hierarchy. Corresponding to rm -r.
  3. DELETE on container recurses into the container hierarchy and deletes as above, but also deletes if the agent has acl:Control, since acl:Control means it could have changed the permission if it wanted to. This would roughly correspond to rm -rf.

To me, the attraction of recursive deletion boils down to two aspects:

  1. It saves work, since the client doesn't have to recurse.
  2. It can be done atomically

Out of these, I think the second is actually the most important. It can be hard to do at scale, but I see it as even harder for the client to do it. The possibility of atomic deletes is important.

However, recursive deletion is also a potentially very destructive operation. We should have a mechanism so that the user can ensure that it doesn't happen by accident. It could prompt for email confirmation, reconfirmation of credentials, a UI that pops up with a message, or something similar (I thought I had an issue open for that, but couldn't find it now).

Edit: Update the first analogy
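
A hypothetical server-side sketch of options 2 and 3 above (the types and the access-check callback are illustrative assumptions, not part of any Solid server API). It separates the ACL decision from the actual deletion, so a server can report which resources block the recursive delete instead of failing halfway through:

```typescript
type AclMode = "Read" | "Write" | "Control";

interface Resource {
  url: string;
  members: Resource[]; // containment hierarchy, already resolved
}

type AccessCheck = (agent: string, url: string, mode: AclMode) => boolean;

// Option 2 ("rm -r"): delete only what the agent can Write.
// Option 3 ("rm -rf"): also delete what the agent Controls, since Control
// would let the agent grant itself Write anyway.
function canDelete(agent: string, resource: Resource,
                   allowControl: boolean, hasAccess: AccessCheck): boolean {
  return hasAccess(agent, resource.url, "Write")
      || (allowControl && hasAccess(agent, resource.url, "Control"));
}

// Collect every resource in the hierarchy the agent may not delete, so the
// server can either proceed (empty result) or refuse with a descriptive error.
function findBlockers(agent: string, root: Resource,
                      allowControl: boolean, hasAccess: AccessCheck): string[] {
  const blockers: string[] = [];
  const visit = (resource: Resource): void => {
    if (!canDelete(agent, resource, allowControl, hasAccess)) {
      blockers.push(resource.url);
    }
    resource.members.forEach(visit);
  };
  visit(root);
  return blockers;
}
```

Checking the whole hierarchy up front also speaks to the partial-failure concern raised earlier: the server can decide before touching anything whether the delete can complete.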

@csarven
Member

csarven commented Nov 20, 2020

Putting this down FWIW as a possible solution:

Noting that HTTP Extensions for WebDAV's DELETE for Collections (https://tools.ietf.org/html/rfc4918#section-9.6.1) uses the Depth header (https://tools.ietf.org/html/rfc4918#section-10.2). What's immediately interesting/relevant:

The DELETE method on a collection MUST act as if a "Depth: infinity"
header was used on it. A client MUST NOT submit a Depth header with
a DELETE on a collection with any value but infinity.

And looking at the requirement for a DELETE request on a container in Solid:

When a DELETE request is made to a container, the server MUST delete the container if it contains no resources. If the container contains resources, the server MUST respond with the 409 status code and response body describing the error.

Solid is not building on WebDAV's DELETE but it might be able to use the Depth header.

One (possibly out of scope) issue may be that a server implementing both Solid and WebDAV may be confronted with conflicting behaviour, i.e. Solid's DELETE on a container resource will not recursively delete, but WebDAV's DELETE will. Those systems will have to make a decision.

Solid's DELETE on a container acts as if Depth: 0 were in the request.

A Solid client using Depth with a value other than infinity will be rejected by WebDAV.

A Solid client using DELETE with Depth: infinity is a non-issue where the server also supports WebDAV, as that is the default behaviour for WebDAV anyway.


While this could technically address a client's intention to recursively delete a container, it comes down to finding the right level of requirement for servers to accept Depth: infinity, if supported at all. Perhaps along these lines:

When a server supports recursive deletion of containers, it MUST do so by accepting requests including the HTTP Depth header with a value of infinity.

Security consideration:
Asking for recursive operations on large collections can be used to attack processing time.
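
A sketch of what the client side of that requirement might look like (the 409 handling reflects the Solid requirement quoted above; nothing here is settled spec text):

```typescript
// Delete a Solid container, optionally asking the server to recurse by
// sending Depth: infinity, the only Depth value RFC 4918 permits on a
// DELETE of a collection.
async function deleteContainer(containerUrl: string, recursive: boolean): Promise<void> {
  const headers: Record<string, string> = {};
  if (recursive) {
    headers["Depth"] = "infinity";
  }
  const response = await fetch(containerUrl, { method: "DELETE", headers });

  if (response.status === 409) {
    // Per the Solid requirement quoted above: the container is not empty
    // and the server did not (or could not) perform a recursive delete.
    throw new Error(`${containerUrl} is not empty; recursive delete not performed`);
  }
  if (!response.ok) {
    throw new Error(`DELETE ${containerUrl} failed with status ${response.status}`);
  }
}
```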

@acoburn
Member

acoburn commented Nov 20, 2020

One thing to note about WebDAV is that any conforming WebDAV server MUST include a DAV header on all OPTIONS responses, which makes it possible for clients to know whether they are dealing with a WebDAV server or not.

If a WebDAV server is at compliance level 1, it needs to support all MUST requirements of the WebDAV spec. Compliance level 2 includes locking.

Trellis, for example, supports WebDAV via an extension module, so it can act as both an LDP server and a WebDAV server (compliance level 1). When WebDAV support is enabled, DELETE operations are recursive; otherwise, they are not.

One important consideration about WebDAV, though: it is very XML-centric, particularly the PROPPATCH and PROPFIND methods, which an implementation needs to find a way to map to RDF-based properties. 207 Multistatus responses are also XML-based.
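
Since the DAV header is mandatory on OPTIONS responses, a client can cheaply probe for WebDAV support before deciding whether a recursive DELETE is even worth attempting; a small sketch (illustrative only):

```typescript
// RFC 4918 requires conforming servers to list their compliance classes
// in a DAV header on OPTIONS responses, e.g. "1" or "1, 2".
async function supportsWebDav(resourceUrl: string): Promise<boolean> {
  const response = await fetch(resourceUrl, { method: "OPTIONS" });
  const dav = response.headers.get("DAV");
  return dav !== null && dav.split(",").map((c) => c.trim()).includes("1");
}
```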
